class: center, middle, inverse, title-slide

# Lecture 2
## New variables and Plots
### Psych 10 C
### University of California, Irvine
### 05/28/2022

---

## Load data into R

- We will keep working with the memory data from last class.
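
- A minimal sketch of loading a data file, assuming the memory data are stored as a CSV file. The temporary file path and rows 5 and 6 of the toy data are made up for illustration; only the column names and the first four rows follow the output shown in these slides.

```r
# Hypothetical stand-in for the class data: write a tiny CSV to a temporary
# file so the example runs on its own (rows 5-6 are invented for illustration).
toy <- data.frame(
  id        = c(1, 2, 3, 4, 1, 2),
  age       = c(20, 29, 29, 25, 20, 29),
  correct   = c(46, 49, 48, 44, 40, 45),
  time_test = c(300, 300, 300, 300, 3600, 3600)
)
path <- tempfile(fileext = ".csv")   # assumed location, not the course file
write.csv(toy, path, row.names = FALSE)

# Read the file back into R and inspect its columns
memory <- read.csv(path)
str(memory)
```

- With the tidyverse loaded, `readr::read_csv()` plays the same role and returns a tibble.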
---

## Creating new variables

- Sometimes we would like to work with a transformation of the variables that we have in a data file.

--

- For example, we might want a variable that tells us whether the number of correctly recalled words came from the first or the second test.

--

This can make some plots easier to make!

---

## Creating a new variable

- We will create a new variable that takes the value "test_1" if the test was performed after 300 seconds of study and the value "test_2" if it was performed after 3600 seconds:

```r
# Label each row by test: 300 s -> "test_1", otherwise "test_2"
memory <- memory %>%
  mutate("test_id" = ifelse(test = time_test == 300,
                            yes = "test_1",
                            no = "test_2"))
head(x = memory, n = 4)
```

```
# A tibble: 4 × 5
     id   age correct time_test test_id
  <dbl> <dbl>   <dbl>     <dbl> <chr>
1     1    20      46       300 test_1
2     2    29      49       300 test_1
3     3    29      48       300 test_1
4     4    25      44       300 test_1
```

---

## Creating a new variable

- With the **`mutate`** function we can create new variables by applying other R functions to existing ones (here, **`ifelse`**).

--

- We will use our new variable to create plots of the variables that we are interested in.

---
class: inverse, center, middle

# Plotting
## Histograms

---

## Histograms

- One of the ways in which we can visualize data is using histograms.

--

- A histogram shows how many times each value (or range of values) of a variable appears in our data.
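
- The counting behind a histogram can be sketched in base R. The scores below are made up for illustration; `table()` counts how many times each value appears, and `cut()` groups values into wider bins.

```r
# Made-up recall scores, for illustration only
correct <- c(46, 49, 48, 44, 46, 48, 46)

# table() counts how many times each distinct value appears
counts <- table(correct)
print(counts)

# Grouping values into wider bins (width 5 here) changes the count per bar
bins <- cut(correct, breaks = c(40, 45, 50))
bin_counts <- table(bins)
print(bin_counts)
```

- Each bar of a histogram has the height of one of these counts.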
---
count: false

### Histogram of correct recalls
.panel1-hist-code-auto[
```r
*ggplot(data = memory)
```
]

.panel2-hist-code-auto[
<!-- -->
]

---
count: false

### Histogram of correct recalls
.panel1-hist-code-auto[
```r
ggplot(data = memory) +
* aes(x = correct)
```
]

.panel2-hist-code-auto[
<!-- -->
]

---
count: false

### Histogram of correct recalls
.panel1-hist-code-auto[
```r
ggplot(data = memory) +
  aes(x = correct) +
* aes(fill = test_id, color = test_id)
```
]

.panel2-hist-code-auto[
<!-- -->
]

---
count: false

### Histogram of correct recalls
.panel1-hist-code-auto[
```r
ggplot(data = memory) +
  aes(x = correct) +
  aes(fill = test_id, color = test_id) +
* geom_histogram(position="identity",
*                binwidth = 1,
*                alpha = 0.4)
```
]

.panel2-hist-code-auto[
<!-- -->
]

---
count: false

### Histogram of correct recalls
.panel1-hist-code-auto[
```r
ggplot(data = memory) +
  aes(x = correct) +
  aes(fill = test_id, color = test_id) +
  geom_histogram(position="identity",
                 binwidth = 1,
                 alpha = 0.4) +
* theme_classic()
```
]

.panel2-hist-code-auto[
<!-- -->
]

---
count: false

### Histogram of correct recalls
.panel1-hist-code-auto[
```r
ggplot(data = memory) +
  aes(x = correct) +
  aes(fill = test_id, color = test_id) +
  geom_histogram(position="identity",
                 binwidth = 1,
                 alpha = 0.4) +
  theme_classic() +
* xlab("Number of correct recalls")
```
]

.panel2-hist-code-auto[
<!-- -->
]

---
count: false

### Histogram of correct recalls
.panel1-hist-code-auto[
```r
ggplot(data = memory) +
  aes(x = correct) +
  aes(fill = test_id, color = test_id) +
  geom_histogram(position="identity",
                 binwidth = 1,
                 alpha = 0.4) +
  theme_classic() +
  xlab("Number of correct recalls") +
* ylab("Frequency")
```
]

.panel2-hist-code-auto[
<!-- -->
]

---
count: false

### Histogram of correct recalls
.panel1-hist-code-auto[
```r
ggplot(data = memory) +
  aes(x = correct) +
  aes(fill = test_id, color = test_id) +
  geom_histogram(position="identity",
                 binwidth = 1,
                 alpha = 0.4) +
  theme_classic() +
  xlab("Number of correct recalls") +
  ylab("Frequency") +
* guides(fill = guide_legend("Test order"), color = "none")
```
]

.panel2-hist-code-auto[
<!-- -->
]

---
count: false

### Histogram of correct recalls
.panel1-hist-code-auto[
```r
ggplot(data = memory) +
  aes(x = correct) +
  aes(fill = test_id, color = test_id) +
  geom_histogram(position="identity",
                 binwidth = 1,
                 alpha = 0.4) +
  theme_classic() +
  xlab("Number of correct recalls") +
  ylab("Frequency") +
  guides(fill = guide_legend("Test order"), color = "none") +
* theme(axis.title.x = element_text(size = 20),
*       axis.title.y = element_text(size = 20))
```
]

.panel2-hist-code-auto[
<!-- -->
]

<style>
.panel1-hist-code-auto {
  color: black;
  width: 38.6060606060606%;
  hight: 32%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
.panel2-hist-code-auto {
  color: black;
  width: 59.3939393939394%;
  hight: 32%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
.panel3-hist-code-auto {
  color: black;
  width: NA%;
  hight: 33%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
</style>

---

## Histograms

- One of the main problems with histograms is that their shape depends on the bar width (bin width) that we choose!

--

- A change in the shape can change our interpretation of the results, so we need to choose the width carefully.

--

- In general, we can use histograms when we have a numeric variable that we want to visualize.

---
class: inverse, center, middle

# Plotting
## Box plots

---

## Box plots

- Box plots are another common way to visualize numeric data.

--

1. Box: has three marks. Its edges represent the first and third quartiles, and the line inside represents the median (second quartile).

--

1. Whiskers: extend to the largest (smallest) observation that lies no more than 1.5 times the interquartile range (the distance between the first and third quartiles) above the third quartile (below the first quartile).

--

1. Every observation beyond the whiskers is treated as an outlier by the plot.
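
- The three parts of a box plot can be computed by hand. This is a minimal sketch of the rule described above, using made-up scores; the variable names are only for illustration.

```r
# Made-up recall scores, for illustration only
correct <- c(30, 42, 44, 45, 46, 47, 48, 49, 50, 52)

# Quartiles: the edges of the box and the median line
q <- quantile(correct, probs = c(0.25, 0.50, 0.75))
iqr <- q[["75%"]] - q[["25%"]]   # interquartile range (height of the box)

# Fences: 1.5 * IQR beyond the edges of the box
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr

# Whiskers end at the most extreme observations inside the fences
lower_whisker <- min(correct[correct >= lower_fence])
upper_whisker <- max(correct[correct <= upper_fence])

# Observations beyond the fences are drawn as outliers
outliers <- correct[correct < lower_fence | correct > upper_fence]
print(outliers)
```

- In this toy sample the score of 30 falls below the lower fence, so it would be drawn as a separate point rather than as part of the whisker.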
--

- We can use the same data as before for an example.

---
count: false

### Box plot of correct responses
.panel1-bp-code-auto[
```r
*ggplot(data = memory)
```
]

.panel2-bp-code-auto[
<!-- -->
]

---
count: false

### Box plot of correct responses
.panel1-bp-code-auto[
```r
ggplot(data = memory) +
* aes(y = correct)
```
]

.panel2-bp-code-auto[
<!-- -->
]

---
count: false

### Box plot of correct responses
.panel1-bp-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
* aes(x = test_id)
```
]

.panel2-bp-code-auto[
<!-- -->
]

---
count: false

### Box plot of correct responses
.panel1-bp-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = test_id) +
* aes(color = test_id)
```
]

.panel2-bp-code-auto[
<!-- -->
]

---
count: false

### Box plot of correct responses
.panel1-bp-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = test_id) +
  aes(color = test_id) +
* scale_color_brewer(palette="Dark2")
```
]

.panel2-bp-code-auto[
<!-- -->
]

---
count: false

### Box plot of correct responses
.panel1-bp-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = test_id) +
  aes(color = test_id) +
  scale_color_brewer(palette="Dark2") +
* geom_boxplot(fill = "white")
```
]

.panel2-bp-code-auto[
<!-- -->
]

---
count: false

### Box plot of correct responses
.panel1-bp-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = test_id) +
  aes(color = test_id) +
  scale_color_brewer(palette="Dark2") +
  geom_boxplot(fill = "white") +
* xlab("Test order")
```
]

.panel2-bp-code-auto[
<!-- -->
]

---
count: false

### Box plot of correct responses
.panel1-bp-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = test_id) +
  aes(color = test_id) +
  scale_color_brewer(palette="Dark2") +
  geom_boxplot(fill = "white") +
  xlab("Test order") +
* ylab("Number of correct recalls")
```
]

.panel2-bp-code-auto[
<!-- -->
]

---
count: false

### Box plot of correct responses
.panel1-bp-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = test_id) +
  aes(color = test_id) +
  scale_color_brewer(palette="Dark2") +
  geom_boxplot(fill = "white") +
  xlab("Test order") +
  ylab("Number of correct recalls") +
* guides(fill = "none", color = "none")
```
]

.panel2-bp-code-auto[
<!-- -->
]

---
count: false

### Box plot of correct responses
.panel1-bp-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = test_id) +
  aes(color = test_id) +
  scale_color_brewer(palette="Dark2") +
  geom_boxplot(fill = "white") +
  xlab("Test order") +
  ylab("Number of correct recalls") +
  guides(fill = "none", color = "none") +
* theme(axis.title.x = element_text(size = 20),
*       axis.title.y = element_text(size = 20))
```
]

.panel2-bp-code-auto[
<!-- -->
]

<style>
.panel1-bp-code-auto {
  color: black;
  width: 38.6060606060606%;
  hight: 32%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
.panel2-bp-code-auto {
  color: black;
  width: 59.3939393939394%;
  hight: 32%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
.panel3-bp-code-auto {
  color: black;
  width: NA%;
  hight: 33%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
</style>

---

## Box plots

- Box plots show us how our data are dispersed. For example, correct responses are closer together in the first test than in the second.

--

- We can also see that the median number of correct responses was higher in the first test.

--

- They can also show us whether our data are dispersed symmetrically around the median. For example, if the two halves of the box and the two whiskers on either side of the median have the same size, the data are symmetric.

--

- There are some variables that we would not expect to be symmetric; think about reaction times in a game.

---
class: inverse, center, middle

# Plotting
## Scatter plots

---

## Scatter plots

- Histograms are useful when we have a single numeric variable and a categorical variable with a few categories.

--

- Box plots can show us how the data are spread and let us compare distributions easily.
--

- Scatter plots are useful when we want to see how two numerical variables "change" together.

---
count: false

### Scatter plot of correct responses vs age
.panel1-scatter-code-auto[
```r
*ggplot(data = memory)
```
]

.panel2-scatter-code-auto[
<!-- -->
]

---
count: false

### Scatter plot of correct responses vs age
.panel1-scatter-code-auto[
```r
ggplot(data = memory) +
* aes(y = correct)
```
]

.panel2-scatter-code-auto[
<!-- -->
]

---
count: false

### Scatter plot of correct responses vs age
.panel1-scatter-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
* aes(x = age)
```
]

.panel2-scatter-code-auto[
<!-- -->
]

---
count: false

### Scatter plot of correct responses vs age
.panel1-scatter-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = age) +
* aes(color = test_id)
```
]

.panel2-scatter-code-auto[
<!-- -->
]

---
count: false

### Scatter plot of correct responses vs age
.panel1-scatter-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = age) +
  aes(color = test_id) +
* geom_point(fill = "white")
```
]

.panel2-scatter-code-auto[
<!-- -->
]

---
count: false

### Scatter plot of correct responses vs age
.panel1-scatter-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = age) +
  aes(color = test_id) +
  geom_point(fill = "white") +
* xlab("Age")
```
]

.panel2-scatter-code-auto[
<!-- -->
]

---
count: false

### Scatter plot of correct responses vs age
.panel1-scatter-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = age) +
  aes(color = test_id) +
  geom_point(fill = "white") +
  xlab("Age") +
* ylab("Number of correct recalls")
```
]

.panel2-scatter-code-auto[
<!-- -->
]

---
count: false

### Scatter plot of correct responses vs age
.panel1-scatter-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = age) +
  aes(color = test_id) +
  geom_point(fill = "white") +
  xlab("Age") +
  ylab("Number of correct recalls") +
* guides(fill = "none", color = guide_legend("Test order"))
```
]

.panel2-scatter-code-auto[
<!-- -->
]

---
count: false

### Scatter plot of correct responses vs age
.panel1-scatter-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = age) +
  aes(color = test_id) +
  geom_point(fill = "white") +
  xlab("Age") +
  ylab("Number of correct recalls") +
  guides(fill = "none", color = guide_legend("Test order")) +
* theme(axis.title.x = element_text(size = 20),
*       axis.title.y = element_text(size = 20))
```
]

.panel2-scatter-code-auto[
<!-- -->
]

---
count: false

### Scatter plot of correct responses vs age
.panel1-scatter-code-auto[
```r
ggplot(data = memory) +
  aes(y = correct) +
  aes(x = age) +
  aes(color = test_id) +
  geom_point(fill = "white") +
  xlab("Age") +
  ylab("Number of correct recalls") +
  guides(fill = "none", color = guide_legend("Test order")) +
  theme(axis.title.x = element_text(size = 20),
        axis.title.y = element_text(size = 20)) +
* geom_smooth(method = lm)
```
]

.panel2-scatter-code-auto[
<!-- -->
]

<style>
.panel1-scatter-code-auto {
  color: black;
  width: 38.6060606060606%;
  hight: 32%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
.panel2-scatter-code-auto {
  color: black;
  width: 59.3939393939394%;
  hight: 32%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
.panel3-scatter-code-auto {
  color: black;
  width: NA%;
  hight: 33%;
  float: left;
  padding-left: 1%;
  font-size: 80%
}
</style>
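
---

## Scatter plots

- The line drawn by `geom_smooth(method = lm)` is a straight line fitted by least squares. A minimal sketch of the same kind of fit with `lm()`, using made-up ages and scores (the data frame `toy` is only for illustration):

```r
# Made-up ages and recall scores, for illustration only
toy <- data.frame(
  age     = c(20, 25, 30, 35, 40),
  correct = c(48, 47, 45, 44, 42)
)

# Fit a straight line: correct = intercept + slope * age
fit <- lm(correct ~ age, data = toy)
print(coef(fit))

# Predicted number of correct recalls for a 28-year-old
pred <- predict(fit, newdata = data.frame(age = 28))
print(pred)
```

- Because `color = test_id` is mapped in the plot, `geom_smooth` fits one such line for each test.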